Efficient-UCBV: An Almost Optimal Algorithm using Variance Estimates
Authors
Abstract
We propose a novel variant of the UCB algorithm, Efficient-UCB-Variance (EUCBV), for minimizing cumulative regret in the stochastic multi-armed bandit (MAB) setting. EUCBV incorporates the arm elimination strategy proposed in UCB-Improved (Auer and Ortner, 2010) while using variance estimates to compute the arms' confidence bounds, similar to UCBV (Audibert, Munos, and Szepesvári, 2009). Through a theoretical analysis we establish that EUCBV incurs a gap-dependent regret bound of $O\left(\frac{K\sigma_{\max}^2 \log(T\Delta^2/K)}{\Delta}\right)$ after $T$ trials, where $\Delta$ is the minimal gap between optimal and suboptimal arms; this bound improves on that of existing state-of-the-art UCB algorithms (such as UCB1, UCB-Improved, UCBV, MOSS). Further, EUCBV incurs a gap-independent regret bound of $O(\sqrt{KT})$, which improves on that of UCB1, UCBV and UCB-Improved while being comparable with that of MOSS and OCUCB. Through an extensive numerical study we show that EUCBV significantly outperforms the popular UCB variants (like MOSS, OCUCB, etc.) as well as Thompson sampling and Bayes-UCB algorithms.
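The two ingredients named in the abstract, variance-based confidence bounds and arm elimination, can be illustrated with a small Python sketch. This is not the EUCBV algorithm from the paper: the function name `run_variance_aware_elimination`, the Bernstein-style radius with constants 2 and 3 (in the spirit of UCBV), and the per-round elimination test are all assumptions chosen for illustration, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def run_variance_aware_elimination(arms, horizon, seed=0):
    """Play `horizon` rounds on `arms` (callables: rng -> reward in [0, 1]).

    Illustrative only: a variance-aware index plus a simple elimination test.
    Returns (pull counts, set of surviving arm indices, total reward).
    """
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k
    sums = [0.0] * k
    sq_sums = [0.0] * k
    active = set(range(k))
    total = 0.0

    def mean(i):
        return sums[i] / counts[i]

    def radius(i, t):
        # Confidence radius from the empirical variance (assumed constants).
        n = counts[i]
        var = max(sq_sums[i] / n - mean(i) ** 2, 0.0)
        log_t = math.log(max(t, 2))
        return math.sqrt(2.0 * var * log_t / n) + 3.0 * log_t / n

    for t in range(1, horizon + 1):
        # Pull every surviving arm once, then follow the largest upper bound.
        unpulled = [i for i in sorted(active) if counts[i] == 0]
        if unpulled:
            arm = unpulled[0]
        else:
            arm = max(sorted(active), key=lambda i: mean(i) + radius(i, t))
        reward = arms[arm](rng)
        counts[arm] += 1
        sums[arm] += reward
        sq_sums[arm] += reward * reward
        total += reward

        # Drop arms whose upper bound falls below the best lower bound.
        if all(counts[i] > 0 for i in active):
            best_lower = max(mean(i) - radius(i, t) for i in active)
            active = {i for i in active if mean(i) + radius(i, t) >= best_lower}

    return counts, active, total


if __name__ == "__main__":
    def bernoulli(p):
        return lambda rng: 1.0 if rng.random() < p else 0.0

    counts, active, total = run_variance_aware_elimination(
        [bernoulli(0.3), bernoulli(0.7)], horizon=2000
    )
    print(counts, sorted(active), total)
```

Low-variance arms get a tighter radius, so they are separated from the rest with fewer pulls; that is the intuition behind using variance estimates. The paper's actual elimination schedule and exploration parameters are not given in the abstract and are not reproduced here.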
Similar resources
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. But the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
Mathematical Analysis of Optimal Tracking Interval Management for Power Efficient Target Tracking Wireless Sensor Networks
In this paper, we study the problem of power efficient tracking interval management for distributed target tracking wireless sensor networks (WSNs). We first analyze the performance of a distributed target tracking network with one moving object, using a quantitative mathematical analysis. We show that previously proposed algorithms are efficient only for constant average velocity objects howev...
An empirical study on statistical analysis and optimization of EDM process parameters for inconel 718 super alloy using D-optimal approach and genetic algorithm
Among the several non-conventional processes, electrical discharge machining (EDM) is the most widely and successfully applied for the machining of conductive parts. In this technique, the tool has no mechanical contact with the work piece and also the hardness of work piece has no effect on the machining pace. Hence, this technique could be employed to machine hard materials such as super allo...
Improved Split-Plot and Multistratum Designs
Many industrial experiments involve some factors whose levels are harder to set than others. The best way to deal with these is to plan the experiment carefully as a split-plot, or more generally a multi-stratum, design. Several different approaches for constructing split-plot type response surface designs have been proposed in the literature since 2001, which has allowed experimenters to make b...
An Efficient Algorithm for Output Coding in Pal Based Cplds (TECHNICAL NOTE)
One of the approaches used to partition inputs consists in modifying and limiting the input set using an external transcoder. This method is strictly related to output coding. This paper presents an optimal output coding in PAL-based programmable transcoders. The algorithm can be used to implement circuits in PAL-based CPLDs.
Journal title:
- CoRR
Volume: abs/1711.03591
Publication date: 2017